Goto

Collaborating Authors

 generate sample


Reward-Directed Conditional Diffusion: Provable Distribution Estimation and Reward Improvement

Neural Information Processing Systems

We explore the methodology and theory of reward-directed generation via conditional diffusion models. Directed generation aims to generate samples with desired properties as measured by a reward function, which has broad applications in generative AI, reinforcement learning, and computational biology. We consider the common learning scenario where the dataset consists of majorly unlabeled data and a small set of data with noisy reward labels. Our approach leverages a learned reward function on the smaller data set as a pseudolabeler to label the unlabelled data. After pseudo-labelling, a conditional diffusion model (CDM) is trained on the data and samples are generated by setting a target value $a$ as the condition in CDM. From a theoretical standpoint, we show that this directed generator can effectively learn and sample from the reward-conditioned data distribution: 1. our model is capable of recovering the data's latent subspace representation.


Compositional Abilities Emerge Multiplicatively: Exploring Diffusion Models on a Synthetic Task

Neural Information Processing Systems

Modern generative models exhibit unprecedented capabilities to generate extremely realistic data. However, given the inherent compositionality of the real world, reliable use of these models in practical applications requires that they exhibit the capability to compose a novel set of concepts to generate outputs not seen in the training data set. Prior work demonstrates that recent diffusion models do exhibit intriguing compositional generalization abilities, but also fail unpredictably. Motivated by this, we perform a controlled study for understanding compositional generalization in conditional diffusion models in a synthetic setting, varying different attributes of the training data and measuring the model's ability to generate samples out-of-distribution. Our results show: (i) the order in which the ability to generate samples from a concept and compose them emerges is governed by the structure of the underlying data-generating process; (ii) performance on compositional tasks exhibits a sudden emergence due to multiplicative reliance on the performance of constituent tasks, partially explaining emergent phenomena seen in generative models; and (iii) composing concepts with lower frequency in the training data to generate out-of-distribution samples requires considerably more optimization steps compared to generating in-distribution samples. Overall, our study lays a foundation for understanding emergent capabilities and compositionality in generative models from a data-centric perspective.





Unifying and extending Diffusion Models through PDEs for solving Inverse Problems

Dasgupta, Agnimitra, da Cunha, Alexsander Marciano, Fardisi, Ali, Aminy, Mehrnegar, Binder, Brianna, Shaddy, Bryan, Oberai, Assad A

arXiv.org Machine Learning

Diffusion models have emerged as powerful generative tools with applications in computer vision and scientific machine learning (SciML), where they have been used to solve large-scale probabilistic inverse problems. Traditionally, these models have been derived using principles of variational inference, denoising, statistical signal processing, and stochastic differential equations. In contrast to the conventional presentation, in this study we derive diffusion models using ideas from linear partial differential equations and demonstrate that this approach has several benefits that include a constructive derivation of the forward and reverse processes, a unified derivation of multiple formulations and sampling strategies, and the discovery of a new class of models. We also apply the conditional version of these models to solving canonical conditional density estimation problems and challenging inverse problems. These problems help establish benchmarks for systematically quantifying the performance of different formulations and sampling strategies in this study, and for future studies. Finally, we identify and implement a mechanism through which a single diffusion model can be applied to measurements obtained from multiple measurement operators. Taken together, the contents of this manuscript provide a new understanding and several new directions in the application of diffusion models to solving physics-based inverse problems.


Reward-Directed Conditional Diffusion: Provable Distribution Estimation and Reward Improvement

Neural Information Processing Systems

We explore the methodology and theory of reward-directed generation via conditional diffusion models. Directed generation aims to generate samples with desired properties as measured by a reward function, which has broad applications in generative AI, reinforcement learning, and computational biology. We consider the common learning scenario where the dataset consists of majorly unlabeled data and a small set of data with noisy reward labels. Our approach leverages a learned reward function on the smaller data set as a pseudolabeler to label the unlabelled data. After pseudo-labelling, a conditional diffusion model (CDM) is trained on the data and samples are generated by setting a target value a as the condition in CDM.


Compositional Abilities Emerge Multiplicatively: Exploring Diffusion Models on a Synthetic Task

Neural Information Processing Systems

Modern generative models exhibit unprecedented capabilities to generate extremely realistic data. However, given the inherent compositionality of the real world, reliable use of these models in practical applications requires that they exhibit the capability to compose a novel set of concepts to generate outputs not seen in the training data set. Prior work demonstrates that recent diffusion models do exhibit intriguing compositional generalization abilities, but also fail unpredictably. Motivated by this, we perform a controlled study for understanding compositional generalization in conditional diffusion models in a synthetic setting, varying different attributes of the training data and measuring the model's ability to generate samples out-of-distribution. Our results show: (i) the order in which the ability to generate samples from a concept and compose them emerges is governed by the structure of the underlying data-generating process; (ii) performance on compositional tasks exhibits a sudden "emergence" due to multiplicative reliance on the performance of constituent tasks, partially explaining emergent phenomena seen in generative models; and (iii) composing concepts with lower frequency in the training data to generate out-of-distribution samples requires considerably more optimization steps compared to generating in-distribution samples. Overall, our study lays a foundation for understanding emergent capabilities and compositionality in generative models from a data-centric perspective.


Reviews: Model-Powered Conditional Independence Test

Neural Information Processing Systems

This paper proposed a model powered approach to conduct conditional independent tests for iid data. The basic idea is to use nearest neighbor bootstrap to generate samples which follow a distribution close to the f {CI} and a classifier is trained and tested to see if it is able to distinguish the observation distribution and the nearest neighbor bootstrapped distribution. If the classification performance is close to the random guess, one fails to reject the null hypothesis that data follows conditional independence otherwise one accept the null hypothesis. In general, the paper is trying to address an important problem and the paper is presented in a clear way. It seems that the whole method can be decoupled into two major components.


Reviews: Bayesian Adversarial Learning

Neural Information Processing Systems

This paper proposes a Bayesian model for adversarial learning problem. Empirical studies on Fashion-MINST and traffic sign recognition show that the proposed methods is slightly better than other adversarial learning baselines. Below I list my concerns about the paper: For modeling, 1. This paper ignore a highly relevant work'Bayesian GAN' [1]. The non-cooperative game between'data generator' and'learner' established in this paper is almost the same as the vanilla GAN.